Learning from errors in grapheme-to-phoneme conversion

Authors

  • Tatyana Polyakova
  • Antonio Bonafonte

Abstract

Speech technology depends on accurate grapheme-to-phoneme (G2P) conversion, which is a hard task, especially for languages such as English that have no obvious letter-to-phone correspondence. Hand-written rules, once widely used, are giving way to machine learning techniques and language-independent tools. In this paper we extend the transformation-based error-driven learning algorithm to the G2P task. Well-known machine learning techniques were combined with the transformation-based algorithm to infer a set of explicit rules that correct pronunciations for U.S. English, Spanish and Catalan. Every method combined with transformation rules significantly outperforms the same method used alone.
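
The idea of learning correction rules from the errors of an initial predictor can be illustrated with a minimal sketch. It is not the authors' implementation: the toy word list, the one-letter-to-one-phone alignment, the frequency baseline and the single rule template (rewrite one phone as another when the next letter has a given value) are all assumptions made for illustration, and names such as `learn_rules` are hypothetical.

```python
from collections import Counter, defaultdict

# Toy data (assumed): each letter is aligned with exactly one phone symbol.
DATA = [
    ("cat", ["k", "a", "t"]),
    ("cot", ["k", "o", "t"]),
    ("act", ["a", "k", "t"]),
    ("cent", ["s", "e", "n", "t"]),
    ("city", ["s", "i", "t", "i"]),
]

def train_baseline(data):
    """Map each letter to its most frequent phone (stand-in for any initial predictor)."""
    counts = defaultdict(Counter)
    for word, phones in data:
        for letter, phone in zip(word, phones):
            counts[letter][phone] += 1
    return {letter: c.most_common(1)[0][0] for letter, c in counts.items()}

def next_letter(word, i):
    """Return the letter after position i, or '#' at the end of the word."""
    return word[i + 1] if i + 1 < len(word) else "#"

def apply_rule(rule, word, phones):
    """Rule (src, dst, nxt): rewrite src as dst when the following letter is nxt."""
    src, dst, nxt = rule
    return [dst if p == src and next_letter(word, i) == nxt else p
            for i, p in enumerate(phones)]

def count_errors(predictions, data):
    """Count phone positions where the prediction differs from the reference."""
    return sum(p != g for pred, (_, gold) in zip(predictions, data)
               for p, g in zip(pred, gold))

def learn_rules(data, predictions):
    """Greedily add the rule with the largest net error reduction until none helps."""
    rules = []
    while True:
        # Propose candidate rules only at positions where the current output is wrong.
        candidates = {
            (p, g, next_letter(word, i))
            for pred, (word, gold) in zip(predictions, data)
            for i, (p, g) in enumerate(zip(pred, gold)) if p != g
        }
        current = count_errors(predictions, data)
        best_rule, best_preds, best_gain = None, None, 0
        for rule in sorted(candidates):
            trial = [apply_rule(rule, word, pred)
                     for pred, (word, _) in zip(predictions, data)]
            gain = current - count_errors(trial, data)  # fixes minus newly broken phones
            if gain > best_gain:
                best_rule, best_preds, best_gain = rule, trial, gain
        if best_rule is None:
            return rules, predictions
        rules.append(best_rule)
        predictions = best_preds

baseline = train_baseline(DATA)
initial = [[baseline[ch] for ch in word] for word, _ in DATA]
rules, corrected = learn_rules(DATA, initial)
errors_before = count_errors(initial, DATA)
errors_after = count_errors(corrected, DATA)
```

On this toy set the baseline always reads the letter "c" as "k", and the learned rules rewrite "k" as "s" before "e" and before "i". The rules stay explicit and ordered, so they can be layered over any initial predictor, which is the kind of combination the abstract describes.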

Related resources

Phoneme-to-grapheme conversion for out-of-vocabulary words in speech recognition

In this report, we show that Out-Of-Vocabulary items (OOVs), recognized using phoneme recognition, can be reasonably reliably transcribed orthographically using Machine Learning techniques. More specifically, (i) we show baseline performance of a machine learning approach to phoneme-to-grapheme conversion when different levels of artificial noise are added (simulating phoneme recognizer errors)...

Modified Grapheme Encoding and Phonemic Rule to Improve PNNR-Based Indonesian G2P

Grapheme-to-phoneme conversion (G2P) is very important in both speech recognition and synthesis. The existing Indonesian G2P based on the pseudo nearest neighbour rule (PNNR) has two drawbacks: the grapheme encoding does not adapt all Indonesian phonemic rules, and the PNNR must select a best phoneme from all possible conversions even though they can be filtered by some phonemic rules. In this p...

Decision Tree Learning for Automatic Grapheme to Phoneme Conversion for Tamil (N. Udhyakumar, C. S. Kumar, R. Srinivasan and R. Swaminathan)

This paper describes a novel approach to grapheme-to-phoneme conversion using a decision tree learning technique. Unlike the rule-based approach, the proposed approach can generate rules spanning a wider context and thus give better conversion accuracy.

Optimizing phoneme-to-grapheme conversion for out-of-vocabulary words in speech recognition

In this report, we present the results of further research on phoneme-to-grapheme (P2G) conversion for Out-Of-Vocabulary items (OOVs), recognized using phoneme recognition, in large vocabulary speech recognition. First, we summarize the results of previous research, and then we start with reporting on several optimization strategies for the Machine Learning technique we used to carry out P2G co...

Training grapheme to phoneme conversion in patients with oral reading and naming deficits: A model-based approach

A model-based treatment focused on improving grapheme to phoneme conversion as well as phoneme to grapheme conversion was implemented to train oral reading skills in two patients with severe oral reading and naming deficits. Initial assessment based on current cognitive neuropsychological models of naming indicated a deficit in the phonological output lexicon and in grapheme to phoneme conversi...

Publication date: 2006